Extract text with AI command #2191

Merged
dmitry-zaitsev merged 1 commit into main from extract-text-with-ai
Dec 16, 2024
Conversation

dmitry-zaitsev
Collaborator

Introducing an extractTextWithAI command that uses an LLM to extract text from a screenshot. Some situations where it is useful:

  • Finding dynamic elements on the screen (e.g. the results of a search)
  • Finding elements by their semantic meaning rather than by hardcoded strings or IDs
  • Bypassing CAPTCHAs

Usage Example

- extractTextWithAI: CAPTCHA value
- inputText: ${aiOutput}
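
To illustrate the data flow in the example above: the first command stores the extracted text in a variable, and the second reads it back via ${aiOutput}. The following is a minimal Kotlin sketch of that hand-off, not Maestro's actual implementation. FlowVariables, set, and interpolate are hypothetical names, the stored value is made up, and the regex substitution is an assumed stand-in for however Maestro really evaluates ${...} expressions.

```kotlin
// Hypothetical variable store for illustration; not part of Maestro's API.
class FlowVariables {
    private val values = mutableMapOf<String, String>()

    fun set(name: String, value: String) {
        values[name] = value
    }

    // Replaces each ${name} placeholder with its stored value.
    // Unknown names are left untouched so the problem stays visible.
    fun interpolate(template: String): String =
        Regex("""\$\{(\w+)\}""").replace(template) { match ->
            values[match.groupValues[1]] ?: match.value
        }
}

fun main() {
    val variables = FlowVariables()
    // Step 1: extractTextWithAI would store the model's answer (assumed value here).
    variables.set("aiOutput", "X7K9Q")
    // Step 2: inputText resolves the placeholder before typing.
    println(variables.interpolate("\${aiOutput}"))
}
```

Leaving unknown placeholders unresolved is a design choice of this sketch only; a real runner might fail the flow instead.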

@@ -84,7 +84,7 @@ class Orchestra(
private val onCommandStart: (Int, MaestroCommand) -> Unit = { _, _ -> },
private val onCommandComplete: (Int, MaestroCommand) -> Unit = { _, _ -> },
private val onCommandFailed: (Int, MaestroCommand, Throwable) -> ErrorResolution = { _, _, e -> throw e },
private val onCommandWarned: (Int, MaestroCommand) -> Unit = { _, _ -> },
Collaborator Author

There is a bit of auto-reformatting in this class. The main relevant change is the extractTextWithAICommand method.

@dmitry-zaitsev
Collaborator Author

Not sure why detekt pushed so many styling changes 🤷 Perhaps these classes were not touched since the style changed?

Contributor

@felipeduartea felipeduartea left a comment

looks good to me! 😄

@@ -413,6 +415,26 @@ class Orchestra(
false
}

private fun extractTextWithAICommand(command: ExtractTextWithAICommand): Boolean = runBlocking {
// Extract text from the screen using AI
if (ai == null) {
Contributor

nit: Although this ai variable was not added now, it probably deserves a better name
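
For readers without the full diff, the hunk above opens with a guard on an optional AI engine. A minimal, self-contained Kotlin sketch of that shape follows. It is not the merged code: TextExtractor, extractTextOrFail, and the error message are hypothetical, and only the ai == null check comes from the diff.

```kotlin
// Hypothetical stand-in for the AI engine; the real type used in Orchestra is not shown in this hunk.
fun interface TextExtractor {
    fun extract(query: String, screenshot: ByteArray): String
}

// Fails fast when no engine is configured, otherwise returns the extracted text.
fun extractTextOrFail(ai: TextExtractor?, query: String, screenshot: ByteArray): String {
    if (ai == null) {
        // Assumed behaviour: what the merged code does in this branch is not visible in the hunk.
        throw IllegalStateException("AI engine is not configured")
    }
    // After the null check, Kotlin smart-casts ai to a non-null TextExtractor.
    return ai.extract(query, screenshot)
}

fun main() {
    // A fake engine that echoes the query, used only to exercise the guard.
    val fake = TextExtractor { query, _ -> "answer for: $query" }
    println(extractTextOrFail(fake, "CAPTCHA value", ByteArray(0)))
}
```

Passing the engine as a nullable parameter keeps the sketch testable with a fake; the parameter keeps the name ai only to match the diff.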

@dmitry-zaitsev dmitry-zaitsev merged commit 4848e9f into main Dec 16, 2024
6 of 8 checks passed
@dmitry-zaitsev dmitry-zaitsev deleted the extract-text-with-ai branch December 16, 2024 18:31